Hierarchical Side-Tuning for Vision Transformers
Fine-tuning pre-trained Vision Transformers (ViT) has consistently
demonstrated promising performance in the realm of visual recognition. However,
adapting large pre-trained models to various tasks poses a significant
challenge. This challenge arises from the need for each model to undergo an
independent and comprehensive fine-tuning process, leading to substantial
computational and memory demands. While recent advancements in
Parameter-efficient Transfer Learning (PETL) have demonstrated their ability to
achieve superior performance compared to full fine-tuning with a smaller subset
of parameter updates, they tend to overlook dense prediction tasks such as
object detection and segmentation. In this paper, we introduce Hierarchical
Side-Tuning (HST), a novel PETL approach that enables ViT transfer to various
downstream tasks effectively. Diverging from existing methods that exclusively
fine-tune parameters within input spaces or certain modules connected to the
backbone, we tune a lightweight and hierarchical side network (HSN) that
leverages intermediate activations extracted from the backbone and generates
multi-scale features to make predictions. To validate HST, we conducted
extensive experiments encompassing diverse visual tasks, including
classification, object detection, instance segmentation, and semantic
segmentation. Notably, our method achieves state-of-the-art average Top-1
accuracy of 76.0% on VTAB-1k, all while fine-tuning a mere 0.78M parameters.
When applied to object detection tasks on the COCO test-dev benchmark, HST even
surpasses full fine-tuning and obtains better performance with 49.7 box AP and
43.2 mask AP using Cascade Mask R-CNN.
ESTextSpotter: Towards Better Scene Text Spotting with Explicit Synergy in Transformer
In recent years, end-to-end scene text spotting approaches have been evolving toward
the Transformer-based framework. While previous studies have shown the crucial
importance of the intrinsic synergy between text detection and recognition,
recent advances in Transformer-based methods usually adopt an implicit synergy
strategy with a shared query, which cannot fully realize the potential of these
two interactive tasks. In this paper, we argue that the explicit synergy
considering distinct characteristics of text detection and recognition can
significantly improve the performance of text spotting. To this end, we introduce
a new model named Explicit Synergy-based Text Spotting Transformer framework
(ESTextSpotter), which achieves explicit synergy by modeling discriminative and
interactive features for text detection and recognition within a single
decoder. Specifically, we decompose the conventional shared query into
task-aware queries for text polygon and content, respectively. Through the
decoder with the proposed vision-language communication module, the queries
interact with each other in an explicit manner while preserving discriminative
patterns of text detection and recognition, thus improving performance
significantly. Additionally, we propose a task-aware query initialization
scheme to ensure stable training. Experimental results demonstrate that our
model significantly outperforms previous state-of-the-art methods. Code is
available at https://github.com/mxin262/ESTextSpotter.
Comment: Accepted to ICCV 202
SPTS v2: Single-Point Scene Text Spotting
End-to-end scene text spotting has made significant progress due to its
intrinsic synergy between text detection and recognition. Previous methods
commonly regard manual annotations such as horizontal rectangles, rotated
rectangles, quadrangles, and polygons as a prerequisite, which are much more
expensive than single-point annotations. For the first time, we demonstrate that
training scene text spotting models can be achieved with an extremely low-cost
single-point annotation by the proposed framework, termed SPTS v2. SPTS v2
retains the advantage of the auto-regressive Transformer through an Instance
Assignment Decoder (IAD), which sequentially predicts the center points of
all text instances within the same prediction sequence, while a Parallel
Recognition Decoder (PRD) recognizes the text in parallel. These two decoders
share the same parameters and are interactively connected with a simple but
effective information transmission process to pass the gradient and
information. Comprehensive experiments on various existing benchmark datasets
demonstrate that SPTS v2 can outperform previous state-of-the-art single-point
text spotters with fewer parameters while achieving 19× faster inference
speed. Most importantly, within the scope of our SPTS v2, extensive experiments
further reveal an important phenomenon that single-point serves as the optimal
setting for scene text spotting compared to non-point, rectangular bounding
box, and polygonal bounding box. Such an attempt provides a significant
opportunity for scene text spotting applications beyond the realms of existing
paradigms. Code will be available at https://github.com/bytedance/SPTSv2.
Comment: arXiv admin note: text overlap with arXiv:2112.0791
SPTS: Single-Point Text Spotting
Existing scene text spotting (i.e., end-to-end text detection and
recognition) methods rely on costly bounding box annotations (e.g., text-line,
word-level, or character-level bounding boxes). For the first time, we
demonstrate that training scene text spotting models can be achieved with an
extremely low-cost annotation of a single-point for each instance. We propose
an end-to-end scene text spotting method that tackles scene text spotting as a
sequence prediction task. Given an image as input, we formulate the desired
detection and recognition results as a sequence of discrete tokens and use an
auto-regressive Transformer to predict the sequence. The proposed method is
simple yet effective, which can achieve state-of-the-art results on widely used
benchmarks. Most significantly, we show that the performance is not very
sensitive to the positions of the point annotation, meaning that points are much
easier to annotate, or even to generate automatically, than bounding boxes,
which require precise positions. We believe that such a pioneering attempt
indicates a significant opportunity for scene text spotting applications of a
much larger scale than previously possible. The code will be publicly
available.
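
The sequence formulation described in this abstract can be illustrated with a minimal sketch. It only shows how single-point annotations and transcriptions might be flattened into one sequence of discrete tokens; it is not the authors' implementation. The bin count, the fixed transcription length, the character set, the token layout, and all function names below are assumptions made for illustration.

```python
from typing import List, Tuple

# All constants below are illustrative assumptions, not values from the paper.
N_BINS = 1000            # number of discrete bins per coordinate axis
MAX_TEXT_LEN = 8         # fixed number of character slots per text instance
CHARS = "abcdefghijklmnopqrstuvwxyz0123456789"

# Hypothetical token layout: [0, N_BINS) are coordinate bins,
# followed by character tokens, followed by special tokens.
CHAR_OFFSET = N_BINS
PAD = CHAR_OFFSET + len(CHARS)
SOS = PAD + 1
EOS = PAD + 2


def quantize(value: float, size: float) -> int:
    """Map a coordinate in [0, size] to an integer bin in [0, N_BINS - 1]."""
    bin_index = int(value / size * N_BINS)
    return min(max(bin_index, 0), N_BINS - 1)


def encode_instance(x: float, y: float, text: str,
                    width: float, height: float) -> List[int]:
    """Encode one text instance as [x_bin, y_bin, char tokens..., padding...]."""
    tokens = [quantize(x, width), quantize(y, height)]
    # Keep only characters in the assumed vocabulary, then truncate.
    kept = [c for c in text.lower() if c in CHARS][:MAX_TEXT_LEN]
    char_tokens = [CHAR_OFFSET + CHARS.index(c) for c in kept]
    char_tokens += [PAD] * (MAX_TEXT_LEN - len(char_tokens))
    return tokens + char_tokens


def encode_image(instances: List[Tuple[float, float, str]],
                 width: float, height: float) -> List[int]:
    """Concatenate all instances of an image into one target sequence."""
    sequence = [SOS]
    for x, y, text in instances:
        sequence.extend(encode_instance(x, y, text, width, height))
    sequence.append(EOS)
    return sequence


if __name__ == "__main__":
    # One instance centered at (50, 25) in a 100x100 image, reading "ab".
    print(encode_image([(50.0, 25.0, "ab")], 100.0, 100.0))
```

An auto-regressive model would then be trained to predict such a sequence token by token given the image; that part is omitted here.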
Knockout of CAFFEOYL-COA 3-O-METHYLTRANSFERASE 6/6L enhances the S/G ratio of lignin monomers and disease resistance in Nicotiana tabacum
Background: Nicotiana tabacum is an important economic crop that is widely planted around the world. Lignin is very important for maintaining the physiological and stress-resistant functions of tobacco. However, higher lignin content will produce lignin gas, which is not conducive to the formation of tobacco quality. To date, how to precisely fine-tune lignin content or composition remains unclear. Results: Here, we annotated and screened 14 CCoAOMTs in Nicotiana tabacum and obtained homozygous double mutants of CCoAOMT6 and CCoAOMT6L through CRISPR/Cas9 technology. Phenotyping showed that the double mutants grew better than the wild type, while the S/G ratio increased and the total sugar decreased. The pathogen resistance test and the extract inhibition test showed that the transgenic tobacco has stronger resistance to tobacco bacterial wilt and brown spot disease, which are caused by Ralstonia solanacearum and Alternaria alternata, respectively. The combined analysis of the metabolome and transcriptome in the leaves and roots suggested that changes in phenylpropane and terpene metabolism are mainly responsible for these phenotypes. Furthermore, molecular docking indicated that the upregulated metabolites, such as soyasaponin Bb, improve disease resistance through highly stable binding with tyrosyl-tRNA synthetase targets in Ralstonia solanacearum and Alternaria alternata. Conclusions: CAFFEOYL-COA 3-O-METHYLTRANSFERASE 6/6L can regulate the S/G ratio of lignin monomers and may affect resistance to tobacco bacterial wilt and brown spot disease by disturbing phenylpropane and terpene metabolism, including metabolites such as soyasaponin Bb, in the leaves and roots of Nicotiana tabacum.
Analysis of Influential Factors of Social Satisfaction in Food Industry
Social satisfaction has become an important factor in helping an enterprise succeed. A new way to measure social satisfaction, called social license, has been proposed recently. Based on the concept of CSR (Corporate Social Responsibility), SLO (Social License to Operate) was put forward in the 1990s. However, the concept of SLO has been used only in a limited range of industries, such as the mining industry, and it has not been well developed or utilized in cases like site selection analysis and satisfaction surveys. In this study, SLO will be explained and tested in the food industry, and a specific survey will be done to analyze the feasibility of this concept as well as the crucial factors that influence the assessment of social satisfaction. There is ample evidence suggesting that SLO, as a measurement of social satisfaction, is quite supportive in decision making for food industry companies.
The Experimental Study of Increased ICP on Cerebral Hemorrhage Rabbits with Magnetic Induction Phase Shift Method
Introduction: Measuring magnetic induction phase shift (MIPS) changes as a function of cerebral hemorrhage volume has the potential to be a simple method for primary and non-contact detection of the occurrence and progress of cerebral hemorrhage. Our previous MIPS study used the intracranial pressure (ICP) as a contrast index and found a primary correlation between MIPS and ICP. Materials and Methods: In this study, we theoretically deduced the approximate relationship between MIPS and ICP and carried out a comparison study between MIPS and ICP on cerebral hemorrhage in rabbits. Acute cerebral hemorrhage was induced by injecting autologous blood (3 to 6 mL) into the brain of rabbits in the experimental group (n=7). Results: The animal experiment results showed that the MIPS decreased significantly as a function of injection volume in the experimental group, and the changes of ICP and MIPS of rabbits in the experimental group presented a negative correlation. We also found that the MIPS slopes of all experimental samples showed a trend from fast to slow, the reverse of the change of ICP. Conclusion: These observations suggest that the non-contact MIPS method might be valuable for monitoring acute cerebral hemorrhage and obtaining ICP information.